ABSTRACT


Cloud storage has become one of the most significant services provided by cloud computing, and its
consumers have reaped numerous benefits from it. Users can apply public-key encryption with keyword search
to search secured data and retrieve the desired data from cloud storage. However, searching over encrypted
keywords still suffers from problems with speed, accuracy, and security. The cross-lingual multi-keyword
Centroid Merkle search over encrypted data (CLCMSE) scheme proposed in this study is based on the Open
Multilingual Wordnet and is meant to address these problems. CLCMSE allows data users to query in any
language and to select the language of the information returned. First, the Centroid technique is used to
cluster and sort the data from the cloud. Then the Merkle search technique is used to increase the search
speed. Finally, the targeted source data is retrieved from the cloud using a fuzzy information retrieval
algorithm. The experimental results indicate that CLCMSE achieves higher accuracy, security, and speed
than existing methods. The study compares it with multi-keyword rank search over encrypted cloud data,
multi-keyword rank searchable encryption, and verifiable attribute multi-keyword search.


Keywords: Cross-Lingual Multi-Keyword, Centroid, Merkle Search, Cloud Storage, WordNet, Encryption,
Data Retrieval.


1. INTRODUCTION


Several well-known and reliable semantic search algorithms use the semantic relationships between
words to broaden their searches over plain text. Precise matching against the keywords in external
files is achieved by using both the query terms and their expanded, semantically related words.
Reliable semantic search relies on three keyword search schemes: synonym-based, mutual information
model, and concept hierarchy [1]. Cloud platforms have grown in popularity as internet technology
has evolved, owing to their massive storage and processing capabilities. By uploading data to the
cloud, users in different physical locations can share resources [2].


Individuals and businesses are increasingly turning to cloud computing to outsource data and access
multiple services. Cloud computing services allow individuals and companies to transfer their
information to a shared cloud server, which relieves cloud clients of the burden of data storage and
administration. Some of the outsourced data is extremely sensitive. In general, searching is hampered
when users encrypt data with traditional methods [3][4]. Searchable encryption is therefore essential
for conducting secure searches over encrypted cloud data while protecting data privacy and
confidentiality. Although this topic is gaining much research attention, existing searchable
encryption algorithms often do not support barrier security measures in a group-oriented cloud
information-sharing environment [5][6]. Searchable encryption methods have been constructed in two
settings, the symmetric-key setting and the public-key setting [7]. In the public-key setting, the
data administrator encrypts the data with the recipient's public key and then sends the resulting
ciphertext. The recipient can obtain the encrypted data by downloading the corresponding ciphertexts
from the cloud storage service, decrypt the whole set of encrypted business data, and extract the
target data locally [8].


Information retrieval and archival have grown in importance as multimedia data processing has
advanced to evaluate real-time data. Searching is a well-known technique used on the Internet.
Traditional, general-purpose data retrieval is based on keyword searches, which have drawbacks such
as a high manpower demand and a reliance on personal information, resulting in poor results. Secure
cloud-based data retrieval has lately attracted considerable attention thanks to the adoption of
cloud computing (CC). Businesses and consumers adopt CC for storing and managing sensitive
information, such as photo collections and personal fitness information, because it offers greater
convenience and financial savings. Before data is stored in the cloud, it is encrypted to ensure
confidentiality. As a result, traditional encryption makes basic data tasks difficult, such as
information retrieval (IR) over encrypted data. With ciphertext, it is extremely difficult to achieve
effective information retrieval while securing customer information [9]. A searchable encryption
interface allows a cloud server to search for keywords in encrypted data securely, without revealing
the data content or the search queries. Searchable encryption techniques include searchable symmetric
encryption, public-key encryption with keyword search (PEKS), and identity-based encryption with
keyword search. However, if the information is shared with several data users, the data owner needs
to run the encryption algorithm many times in these schemes [10].


This work presents a novel data searching and retrieval scheme based on the Open Multilingual
Wordnet: the Cross-Lingual Multi-keyword Centroid Merkle Search over Encrypted Data (CLCMSE). The
method combines clustering with keyword search to retrieve the data matching the input keyword.
CLCMSE is found to be an effective cloud data retrieval method because it eliminates similar data
with a low retrieval rate. It also offers fast search and a high level of trustworthiness. The
remainder of the paper is structured as follows: Section 2 discusses the literature, and Section 3
presents the methodology. Section 4 describes the full system's method. Section 5 discusses the
experimental observations. Section 6 concludes the paper.


1.1 Background of the study 


A cloud system can be configured as a cloud storage system when it is primarily used for data
management and storage (rather than computation and processing). Simply put, cloud storage is any
form of online storage that users can access from any networked device at any time and from
anywhere. Encryption technologies are frequently used to enhance the security of cloud storage
because people are increasingly worried about data privacy and because cloud servers are vulnerable
to threats from inside and outside the system. Even though encryption can guarantee the privacy of
data transferred to the cloud, retrieving ciphertext data can be difficult. In recent years, finding
ways to enable data searching while maintaining the privacy of sensitive data has become crucial.

1.2 Problem statement 

Secure search over encrypted data has been a research focus for the past few years and is now better
understood thanks to the development of searchable encryption technology. Nevertheless, the majority
of search algorithms based on searchable encryption now in use only handle queries in a specific
language. The few multilingual searchable encryption systems that have been implemented do not
achieve automatic cross-lingual retrieval, which has a substantial negative influence on the user's
search experience. Furthermore, the majority of these systems only support exact matching and cannot
support semantic search. Thus, combining cross-lingual and semantic search remains an open problem in
searchable encryption. We are aware of no prior studies that have looked into cross-lingual ranked
search over encrypted cloud data. Our solution, based on the Open Multilingual Wordnet, is a
cross-lingual, multi-keyword Centroid Merkle search over encrypted data (CLCMSE). CLCMSE can overcome
language barriers and achieves intelligent and personalised search through adjustable keyword and
language preference settings.


2. RELATED WORK


Ge et al. [11] introduced a ciphertext-policy attribute-based keyword search and data sharing
(CP-AB-KSDS) mechanism for encrypted cloud data. Because an adversary can select a keyword and
create an identical keyword ciphertext, key secrecy captures the fact that the tokens containing
the keyword cannot be covered. They proposed a concrete scheme and demonstrated that it was secure
in the random oracle model against chosen-ciphertext and chosen-keyword attacks. Finally, the
performance and property comparisons showed that the proposed construction was realistic and
practical.


Liu et al. [12] presented a fast and accurate searchable encryption (FASE) technique. Two measures,
the keyword matching degree and the significance score, are used to return reliable search results
and improve the search performance. According to theoretical analysis and experimental findings, the
FASE scheme achieves a quick and precise multi-keyword ranked search. However, the FASE scheme still
has limitations; in particular, it does not provide a dynamic searchable encryption system that can
handle changing circumstances.


Zhang et al. [13] presented a secure multi-attribute linguistic keyword search technique over
encrypted cloud data. The study carried out a sufficient number of experiments to test the
efficiency of the proposed structure. The presented approach was reported to outperform linear
search. According to extensive analysis and experimental findings, the presented systems are quite
feasible. In addition, the approach requires less computing power for searches, trapdoor creation,
and configuration.


Elizabeth et al. [14] presented a verifiable top-k searchable encryption scheme for cloud data
(VSED) that supports dynamic update operations, including adding and removing documents. The
analysis indicated that the structure was effective in terms of both time and storage complexity.
The system model comprised the Document Owner (DO), the cloud service provider (CSP), the Secure
Coprocessor (SCP), and the Data User (DU); however, the method was insecure and inefficient in the
absence of an SCP.


Miao et al. [15] provided a practical attribute-based keyword search scheme enabling a remote access
policy in the shared multi-owner setting (ABKS-SM). In the generic bilinear group model, the ABKS-SM
systems resisted off-line keyword-guessing attacks and achieved the required security. However, one
drawback of the ABKS-SM systems was that the computational and storage costs increased with the
number of system attributes.


Using a fuzzy information technique based on lattice assumptions, Yang et al. [16] demonstrated a
secure multimedia cloud with multi-user capability. The scheme guarantees that a group of authorised
users can search encrypted multimedia data without sharing private keys. However, the authors do not
address the design of lattice-based searchable encryption schemes with more flexible query patterns,
such as Boolean, range, and subset queries.


Shen et al. [17] proposed P3, a privacy-preserving phrase search approach for secure retrieval of
encrypted data in cloud-based IoT. They used encryption algorithms and a bilinear map to evaluate
the positional relationship of multiple queried keywords over encrypted data. The experimental
evaluation demonstrated the performance and efficiency of the scheme.


A conjunctive multi-keyword ranked secure 
search approach was presented by Yin et al. [18] 
for different data owners. Comprehensive findings 
demonstrated the accuracy and applicability of the 
suggested approach. 


Dai et al. [19] proposed a privacy-preserving searchable encryption technique based on the Latent
Dirichlet Allocation (LDA) topic model. The topic model accelerated the search while decreasing the
size of the vectors. The cost of searching was further reduced by using a tree-based index. The
result is a precise, privacy-preserving, semantic ranked search method over encrypted cloud data.


Guo et al. [20] presented a multi-keyword ranked search method over cloud data that efficiently
supports dynamic operations. To improve search performance, an index tree over the associated
information was built using a Bloom filter. Furthermore, their method better supports document
update operations such as deletion and insertion. The experiments revealed that the approach could
meet its design goals efficiently and effectively.


He et al. [21] presented a searchable encryption scheme that uses attribute-based access control to
enable hybrid Boolean keyword search over outsourced encrypted content. Authorised users can carry
out more expressive searches, such as any required Boolean keyword expression. Additionally, the
security of the system was demonstrated under their security model.


Y. Miao et al. [22] proposed an efficient role-based authorised keyword search scheme for
controlling hierarchical access permissions. Using it as a building block, they extended the scheme
to enable efficient user management and token-generation preprocessing. The approach was formally
shown to resist both CKA and IGKA. Empirical trials confirmed the method's effectiveness and
viability in real-world situations. Planned future work involves evaluating a real-world prototype
of the method and investigating more expressive and efficient multi-keyword searches with
hierarchical access control.


Shen et al. [23] proposed a blockchain-based electronic medical record system with multi-keyword
searchable encryption (BMSE). The scheme has two parts. First, it combines symmetric data encryption
using the Advanced Encryption Standard (AES) with blockchain technology and encrypts the search
index using attribute-based encryption (ABE). This aims to solve the problem of a centralised CSP's
excessive authority, which might compromise patient privacy. Second, to address the low efficiency
of existing multi-keyword searchable encryption methods, the authors employ the K-means algorithm to
cluster the documents and use the relevance score between keywords and documents as the search
index. The security of BMSE is confirmed through a security analysis, and experiments indicate that
BMSE improves search efficiency.


By utilising Intel SGX, Liu et al. [24] presented a dynamic multi-client fuzzy keyword search
technique that ensures forward privacy at the cost of multiple trapdoor communications. With the
help of Intel SGX, the approach can reduce client-side processing and communication overhead.
Furthermore, the authors presented an enhanced multi-client fuzzy keyword search method that
maintains forward privacy even when a compromised user is present. The efficiency and security
evaluation demonstrated that the methods are suitable for real-world applications and offer the
necessary security level.

For blockchain-assisted cloud-edge storage, Liu et al. [25] presented a trustworthy and robust
multi-keyword search (TRMS) that allows data users to choose between a more thorough search based on
cloud servers and a faster search based on edge servers. The approach deploys a blockchain-based
smart contract to run the search algorithm and update the score-based trust management model, which
enables the selection of dependable servers and trustworthy search results. Trust scores and search
results are recorded and published on the blockchain. Data users can determine whether the returned
documents are the top-k documents by inspecting the search results.

Q. Tong et al. [26] presented an efficient, verifiable, privacy-preserving Boolean range query
(BRQ) scheme with suppressed leakage. First, they use the Bloom filter and Gray code to transform a
BRQ into a multi-keyword query. Then, by integrating the distributed point function and PRP-based
Cuckoo hashing, they achieve efficient oblivious multi-keyword queries while protecting the access
and search patterns. Additionally, by employing aggregate MAC, keyed-hashing MAC, oblivious query,
and an XOR-homomorphic pseudorandom function, they provide lightweight and oblivious result
verification. This allows query users to confirm the correctness of the results using a proof whose
size is independent of the size of the outsourced dataset. Lastly, formal security analysis and
thorough experiments show that the system is adaptively secure and efficient for real-world
applications, respectively.

Wang et al. [27] proposed a multi-keyword searchable encryption system to increase the efficiency
and security of trustworthy exchange of sensitive material. The scheme creates a blockchain-based
architecture for sharing private data, and the tamper-proof property of the distributed ledger
guarantees the authenticity of encrypted data and indexes in addition to the monitoring of sharing
behaviour. Based on this, the authors refined the inverted index structure to achieve efficient
multi-keyword searchable encryption and prevent keyword-pair result pattern leakage. The simulation
results demonstrate the effectiveness of the technique, and the scheme has withstood a thorough
security analysis.

Shen et al. [28] introduced a technique known as MKSABE-VaAR (multi-keyword searchable
attribute-based encryption with verification and attribute revocation). Most attribute-based
searchable encryption techniques suffer from latency and unnecessary computation during the search
process and permit only single-keyword search. To address these issues, the authors first bundle
many keywords into a polynomial to enable multi-keyword search. This polynomial helps MKSABE-VaAR
increase search efficiency and reduce the number of bilinear pairing operations required for search
by decreasing the amount of search computation over keyword ciphertexts. They also developed a
dedicated index structure that incorporates user attribute verification in the keyword search
procedure, thereby enhancing search precision.


3. PROPOSED METHODOLOGY


Searchable Encryption (SE) makes it possible to search encrypted data using search phrases. However,
most existing SE systems are incapable of dealing with shared records with hierarchical structures.
The SE approach, which permits cloud servers to search encrypted data on behalf of data owners
without affecting data confidentiality, has significantly improved the security, speed, and
relevancy of searching over encrypted data. This paper proposes a cross-lingual, multi-keyword
Centroid Merkle search over encrypted cloud data, called CLCMSE. Because the cloud holds a
significant amount of data, the Centroid algorithm, a machine learning approach, is used to cluster
the cloud data efficiently. After the cloud data is partitioned, the clustered data is sorted. The
Merkle search approach is then used to speed up the search process.


Additionally, the secure data is retrieved from the cloud using an effective retrieval technique:
the fuzzy retrieval technique retrieves the exact data from cloud storage in a secure manner. The
proposed system also uses the Open Multilingual Wordnet for multilingual keyword search.


[Figure 1 (flowchart). Box labels: search query node for multi-keyword; cloud storage; select
centroid data with the highest D-vector among all the data; data searching by Merkle search; fuzzy
retrieval of data from the cloud; result sent back on the reverse path.]


Fig. 1. Suggested Model


The diagram above (Figure 1) depicts the general operation of the proposed encrypted keyword search
method. The input is a query word, which is the user's desired search term. Initially, the user
sends a query request to the server. The search term is encrypted, and the system accesses cloud
storage. The enhanced Centroid clustering algorithm clusters and sorts the data from cloud storage;
it clusters a large amount of data by selecting the centroid data with the highest D-vector. Once
the data has been clustered and sorted, the search algorithm is applied. The proposed Merkle search
technique explores the targeted data using point-to-point applications, which improves search
efficiency and increases search speed.

The retrieval technique then retrieves the data from the cloud based on the membership function.
The proposed system employs the fuzzy retrieval technique to retrieve data from cloud storage
efficiently. With the Cross-Lingual Multi-keyword Centroid Merkle Search over Encrypted Data
(CLCMSE) scheme, the keyword search procedure allows the user to query and retrieve data from the
cloud. The method improves user satisfaction and speed and enhances the retrieval technique while
ensuring cloud data security. The enhanced keyword search technique returns the exact result for
the user's search word.

Research Contribution:


Researchers' interest in cross-lingual multi-keyword search has surged recently. Studies in this
area focus on developing solutions that return accurate results in various languages. In this
paper, we make the following contributions.

• We believe we are the first to investigate cross-lingual multi-keyword ranked search over
  encrypted cloud data. Our system enhances cross-lingual query capabilities by leveraging the Open
  Multilingual Wordnet (OMW) with language conversion and semantic extension, aiming to overcome
  language barriers in searchable encryption.

• The flexible keyword and language preference settings, along with automated scoring based on
  semantics, enable intelligent and personalised ranked search, enhancing accuracy across all
  languages.

• We assess our scheme's performance in terms of accuracy, security, and speed through extensive
  experiments.
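
The language conversion step in the first contribution can be illustrated with a minimal sketch. It assumes a tiny in-memory lexicon standing in for the Open Multilingual Wordnet; the synset identifiers, lemmas, language codes, and the function name expand_query are illustrative assumptions and are not taken from the CLCMSE implementation.

```python
# Toy stand-in for a multilingual wordnet: synset id -> {language code: [lemmas]}.
# All entries below are illustrative assumptions, not data from the paper.
TOY_OMW = {
    "cloud.n.01": {"eng": ["cloud"], "deu": ["Wolke"], "ind": ["awan"]},
    "storage.n.01": {
        "eng": ["storage", "repository"],
        "deu": ["Speicher"],
        "ind": ["penyimpanan"],
    },
}


def expand_query(keywords, source_lang, target_langs, lexicon=TOY_OMW):
    """Map each source-language keyword to its lemmas in the chosen target languages."""
    expanded = {}
    for kw in keywords:
        terms = set()
        for lemmas_by_lang in lexicon.values():
            # A synset matches when the keyword is one of its source-language lemmas.
            if kw in lemmas_by_lang.get(source_lang, []):
                for lang in target_langs:
                    terms.update(lemmas_by_lang.get(lang, []))
        expanded[kw] = sorted(terms)
    return expanded
```

Including the source language among the target languages also returns same-language synonyms, which corresponds to the semantic extension mentioned above; a keyword with no matching synset maps to an empty list.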

3.1. Clustering 


Data clustering is a challenging task in machine learning, pattern recognition, image processing,
and information extraction. Clustering arranges related data into a single cluster using similarity
metrics (such as Euclidean distance). Although several data clustering strategies have been
presented, traditional approaches typically perform poorly on high-dimensional data because their
similarity metrics are inefficient. Furthermore, on large-scale datasets, these approaches require
significant computation time [29]. In this paper, the Centroid clustering technique is proposed and
used to cluster the data from the cloud accurately.


3.2. Searching 


A data-searching mechanism is required so that authorised devices can search for and receive shared
data. Few solutions are currently available to handle the difficulties of secure data sharing and
cloud search. Because the basic goal of cloud storage is to reuse data in the future, data consumers
must be able to identify a specific group of data rapidly and precisely [30]. In this paper, the
enhanced Merkle search technique is used for the data-searching process. The technique works
efficiently with keyword search and searches the data in the cloud quickly.
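
As a rough illustration of the hash-tree idea behind the Merkle search (Section 4 describes a ternary hash tree), the sketch below computes a root hash in which every parent hashes the concatenation of up to three child hashes. The choice of SHA-256, the function name ternary_merkle_root, and the handling of an incomplete last group are assumptions made for this example; the paper states that each parent has exactly three children and does not specify these details.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (the hash function is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def ternary_merkle_root(blocks):
    """Return the hex root hash of a hash tree whose parents cover up to three children."""
    if not blocks:
        raise ValueError("at least one block is required")
    # Leaf level: one hash per data block.
    level = [_h(b) for b in blocks]
    # Repeatedly hash groups of three child hashes until a single root remains.
    while len(level) > 1:
        level = [_h(b"".join(level[i:i + 3])) for i in range(0, len(level), 3)]
    return level[0].hex()
```

Changing any block changes the root, which is what allows the integrity of many blocks to be checked against a single value. A production design would typically also distinguish leaf hashes from internal hashes; that is omitted here for brevity.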


3.3. Data Retrieval 


Many real-world applications depend heavily on information retrieval, such as online databases,
expert finding, and web browsing. Information retrieval (IR) is the process of extracting, from
enormous collections, the information resources relevant to a specific information need. Because
there may be several relevant resources, the retrieved results are usually ranked by some criterion
of relevance [31]. The process of extracting text, images, or multimedia content tailored to a query
from online resources is also known as IR. General (horizontal) searches typically retrieve
information with substantially lower precision than topical (vertical) searches. Consequently,
researchers are continuously working to improve the accuracy of information retrieval methods [32].
This research uses the fuzzy retrieval technique to retrieve secure data from cloud storage: the
clustered cloud data is searched, after which the retrieval system securely retrieves the data from
the cloud.
4. PROPOSED CENTROID MERKLE 
SEARCH RETRIEVAL METHOD 
Clustering analysis is the technique of grouping a set of objects into non-overlapping subsets.
Every subset consists of items that are similar to each other (intra-similarity) and dissimilar to
those in other groups (inter-dissimilarity). There are numerous methods for clustering. Among
partitioning-based methods, two of the most popular are the K-means and K-medoids algorithms. The
K-medoids algorithm is more precise and more robust to noise and outliers than K-means, at the cost
of slower processing. As a result, the K-medoids approach is also popular. Consider a data set

$D = \{c_1, c_2, \ldots, c_n\}$ (1)

of $n$ objects, each of which has $m$ attributes, $c_i \in \mathbb{R}^m$. The term $dist(c_i, c_j)$
refers to the distance between the objects $c_i$ and $c_j$ in $D$. The fast K-medoids clustering
technique selects $K$ medoids

$b(i),\ i = 1, \ldots, K$ (2)

where $B = \{b_1, b_2, \ldots, b_K\} \subseteq D$, (3)

then, based on the distance $dist(\cdot,\cdot)$, allocates each object in $D$ to the nearest medoid,
so that $K$ clusters $L_1, L_2, \ldots, L_K$ are formed to minimise the total cluster cost $F$:

$F = \sum_{k=1}^{K} \sum_{c \in L_k} dist(c, b_k)^2$ (4)

Formulation 1: The object distance $dist(\cdot,\cdot)$ is defined as the Euclidean distance:

$dist(c_i, c_j) = \sqrt{\sum_{t=1}^{m} (c_{it} - c_{jt})^2}$ (5)

where $m$ is the number of attributes of an object.
Formulation 2: The variance of data set $D$ is expressed as:

$\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} dist(c_i, \bar{c})^2}$ (6)

where

$\bar{c} = \sum_{i=1}^{n} c_i / n$ (7)

is the object mean. The variance of object $c_i$ is defined as follows:

$\sigma_i = \sqrt{\frac{1}{n-1} \sum_{j=1}^{n} dist(c_i, c_j)^2}$ (8)

The variance represents the degree of divergence of the data. Eq. (6) measures the differences
between all objects and their mean, whereas Eq. (8) measures the differences between $c_i$ and all
other objects. Because outliers are generally located far from the central region, they have high
variance; the larger an object's variance, the less likely it is to be a medoid. The proposed
Centroid clustering algorithm clusters the cloud data accurately. However, clustering alone does not
carry out the other steps of keyword search, namely searching for and retrieving data from the
cloud, so an improved search algorithm is implemented in this paper.
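
The two ingredients described above, ranking objects by their per-object variance and assigning objects to the nearest medoid, can be sketched as follows. This is a minimal illustration under stated assumptions: the per-object variance is taken as the square root of the summed squared Euclidean distances to all objects divided by n - 1, the cluster cost is the sum of squared distances to the assigned medoid, and the function names are hypothetical. It is not the authors' Centroid algorithm.

```python
import math


def dist(a, b):
    """Euclidean distance between two attribute vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def object_deviation(i, data):
    """Spread of object i relative to all objects (assumed 1/(n-1) normalisation)."""
    n = len(data)
    return math.sqrt(sum(dist(data[i], c) ** 2 for c in data) / (n - 1))


def rank_by_deviation(data):
    """Indices sorted from most central (low deviation) to most outlying."""
    return sorted(range(len(data)), key=lambda i: object_deviation(i, data))


def assign(data, medoids):
    """Assign each object to its nearest medoid; return (clusters, total cost)."""
    clusters = {m: [] for m in medoids}
    cost = 0.0
    for idx, c in enumerate(data):
        nearest = min(medoids, key=lambda m: dist(c, data[m]))
        clusters[nearest].append(idx)
        cost += dist(c, data[nearest]) ** 2
    return clusters, cost
```

For example, with the points (0,0), (0,1), (1,0), (10,10), (10,11) and the outlier (50,50), the outlier is ranked last by rank_by_deviation, which matches the observation that high-variance objects are poor medoid candidates.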

The Merkle tree, also known as a hash tree, was proposed by Ralph Merkle. The leaf nodes hold the
hash values of the data blocks, and each parent node holds a hash of its children's hash values. A
ternary tree is a data structure in which a node has three child nodes: left, middle, and right. The
proposed methodology uses it as a hash tree. In the ternary hash tree, each parent node has exactly
three child nodes. The integrity of the data blocks of a massive tree is validated using the
generated root hash value. Merkle Search Trees maintain an ordered set of elements in a totally
ordered space $S$. To create maps, elements can be paired with values from a set $U$ (in which case
$S$ becomes the set of keys). $U$ provides a default element $\bot$ ('bottom'), which signals a key
missing from the map

$f : S \to U$. (9)

If $U$ is a CLCMSE with a merge operation $\sqcup_U$ such that

$\forall y \in U,\ \bot \sqcup_U y = y \sqcup_U \bot = y$ (10)

then Merkle Search Trees apply CLCMSE on $S \to U$ as specified by a point-to-point application of
$\sqcup_U$:

$(f \sqcup h)(y) = f(y) \sqcup_U h(y)$. (11)

The sequence of states a CLCMSE takes is meant to be monotonic with respect to $\sqcup$: a
transition from a state $y$ to a state $y'$ must be of the form

$y' = y \sqcup \delta$. (12)

In Merkle Search Trees, this means limiting the operations that can be employed to those that
produce monotonic sequences for each key. As a result, instead of using the operations put and
delete directly, updates of the following type should be used:

$update(f, s, v) = put(f, s, get(f, s) \sqcup_U v)$. (13)

By choosing $U$ properly, this technique can represent different types of data. For instance, if

$U = \{\bot, \top\}$ (14)

a grow-only set defined on $S$ is obtained, in which a Boolean indicates whether an item is present.
To get a key-value store with last-writer-wins reconciliation, $U$ can be a last-writer-wins
register holding the latest version. Any existing CLCMSE type can be used as the value type $U$,
resulting in a map CLCMSE construction that efficiently identifies differing items. The Merkle
search algorithm completes the search process, and finally the system applies a data retrieval
approach, here the fuzzy retrieval system, to recover data from the cloud.
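
The point-to-point merge and the monotonic update rule described above can be sketched on plain dictionaries. The merge operation used here (keep the larger value, with the absent key acting as the default element) is only one possible choice for U and is an assumption of this example, as are the names BOTTOM, join_max, update, and merge.

```python
BOTTOM = None  # stands for the default element (key absent from the map)


def join_max(a, b):
    """Example merge on U: BOTTOM is the identity, otherwise keep the larger value."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return max(a, b)


def update(f, s, v, join=join_max):
    """Monotonic write: store the merge of the current value and v, never overwrite."""
    g = dict(f)
    g[s] = join(f.get(s, BOTTOM), v)
    return g


def merge(f, h, join=join_max):
    """Point-to-point merge of two maps: merge the values key by key."""
    return {y: join(f.get(y, BOTTOM), h.get(y, BOTTOM)) for y in set(f) | set(h)}
```

Because every write goes through the merge operation, the value stored under a key can only move forward, and merging two replicas gives the same result regardless of the order of the arguments.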
The fuzzy technique works well for data retrieval systems. According to fuzzy set theory, every
variable (input or output) is described by a linguistic variable $LV_i$. Linguistic values $y_i^j$
are the values that $LV_i$ can take within its universe of discourse. The elements of the universe
of discourse partially belong to a linguistic value, with a degree in [0, 1] given by a membership
function $\mu_j(LV_i)$ that takes the inputs $FRS_i$ of the fuzzy retrieval system (FRS). Crisp
values, in contrast, are exact (non-fuzzy) values.

$\mu_j(LV_i) : FRS_i \to [0, 1]$ (15)


The value 1 represents full membership of the set. The fuzzification procedure converts the crisp
values into fuzzy values, and different techniques, such as singleton fuzzification, can be used.
IF-THEN inference rules govern how the input is mapped to the output. An inference phase is
necessary to draw conclusions from both the rule base and the input.
IF $LV_1 \to y_1^j,\ LV_2 \to y_2^j,\ \ldots,\ LV_i \to y_i^j$ (16)
THEN $o^j = G_j(\cdot)$ (17)
In the preceding steps, "·" denotes the function's argument. An output expressed in terms of the
input data is generated using Takagi-Sugeno inference. A defuzzification step is required at the
end of the process to obtain crisp output data.


oS Gin; 
Z==>— 1 
ye i ( s) 
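The inference and defuzzification steps can be
sketched as follows. This is a minimal Takagi-Sugeno
example under stated assumptions, not the authors'
rule base: the input names keyword_match and
language_match, the membership functions low and
high, the use of the minimum as the AND operator,
and the constant consequents are all hypothetical.

```python
def high(x):
    """Assumed membership function: degree to which x in [0, 1] is 'high'."""
    return max(0.0, min(1.0, x))


def low(x):
    """Assumed membership function: complement of 'high'."""
    return 1.0 - high(x)


# Each rule: (list of (input name, membership function), consequent function).
RULES = [
    ([("keyword_match", high), ("language_match", high)], lambda inp: 1.0),
    ([("keyword_match", low)], lambda inp: 0.0),
]


def takagi_sugeno(rules, inputs):
    """Weighted average of rule consequents, weighted by firing strength."""
    numerator = 0.0
    denominator = 0.0
    for antecedents, consequent in rules:
        # Firing strength: AND of the antecedents, taken here as the minimum.
        strength = min(mu(inputs[name]) for name, mu in antecedents)
        numerator += strength * consequent(inputs)
        denominator += strength
    # If no rule fires, return 0.0 rather than dividing by zero.
    return numerator / denominator if denominator > 0 else 0.0
```

The returned value is crisp, so it can be used
directly to rank or filter retrieved documents.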
Fuzzification transforms the given input data into
knowledge representations. Next, fuzzy logic (FL)
creates functions based primarily on "categories"
that are easier for physicians to recognise, and
clusters information with similar qualities for
decision-making operations. Finally, the fuzzy
retrieval system retrieves the data from the cloud
efficiently.


3. RESULT 

In this section, we run several tests to evaluate
how well the proposed CLCMSE search method works.
The tests were run on a PC with a 3.40 GHz Intel®
Core™ i7-6700 processor and 16.0 GB of RAM. Each
algorithm and protocol in CLCMSE is implemented in
the Java programming language using the Eclipse
integrated development environment. The query
extension is built with the Natural Language Toolkit
(NLTK) WordNet interface, which also retrieves the
Open Multilingual Wordnet corpus. To assess its
precision, the presented search scheme is compared
with other techniques in terms of performance and
efficiency. The satisfaction result of the proposed
scheme determines whether the returned results match
the needs of the data users. Because the OMW (Open
Multilingual Wordnet) supports multiple languages,
the accuracy of the search results depends on the
extended language that is chosen. We tested our
approach on one hundred native speakers of different
popular languages. In this experiment, the query
language is English and the target language is each
participant's native tongue.


Table 1. Accuracy of the major languages spoken
throughout the world


Russian 70% 10%
Portuguese 5%
Japanese 98% 2%
German 72% 11%
Indonesian 99% 0%
Chinese 63% 9%


Relevant data were categorised using the
satisfaction, basic satisfaction, and
dissatisfaction categories. To counteract the
negative effects of dataset selection, we generate
datasets in many languages using almost identical
keyword collections. The input from the participants
is recorded in Table 1.


Table 2. Index tuple storage overhead with
varying document collection sizes

MRSE (MB) | [illegible]
CERSE (MB) | 10.7 | 21.44 | 32.16 | 42.88 | 53.6
CLCMSE (MB) | 12.1 | 23.65 | 35.1 | 43.96 | 61.49


Table 2 contrasts the index tuple storage overheads
of the two systems for varying document collection
sizes and keyword counts. The experiment uses a
total of 4000 documents, each containing two
keywords.
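To make the comparison concrete, the relative
difference between two rows of Table 2 can be
computed directly. The sketch below uses the five
values reported in the rows labelled CERSE and
CLCMSE; the document collection size behind each
column is not legible in the table, so the columns
are only indexed by position. The function name
relative_overhead is an assumption.

```python
# Values in MB, copied from the CERSE and CLCMSE rows of Table 2.
cerse_mb = [10.7, 21.44, 32.16, 42.88, 53.6]
clcmse_mb = [12.1, 23.65, 35.1, 43.96, 61.49]


def relative_overhead(baseline, proposed):
    """Fractional increase of each proposed value over its baseline value."""
    return [(p - b) / b for b, p in zip(baseline, proposed)]


overheads = relative_overhead(cerse_mb, clcmse_mb)
for column, (b, p, r) in enumerate(zip(cerse_mb, clcmse_mb, overheads), start=1):
    print(f"column {column}: {b:.2f} MB -> {p:.2f} MB (+{r:.1%})")
```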
Table 3. Storage overhead of index tuples in
documents containing varying numbers of keywords


Table 3 contrasts the index tuple storage overheads
of the two systems for varying numbers of keywords
per document. The findings show that, compared with
the MRSE method, the proposed CLCMSE method has a
considerably higher index storage overhead.

Table 4. Execution time of the proposed searching
scheme with different bit lengths of S


CS | 04 | 11 | 1.95 | 3.6 | 10. | 36.4 
P 25 | 45 8 32 || 25 26 
Itoi | 28) | S2 || 20. | 43, | So, | 205. 
al | 24 | 15 16 AS || 2 45 


The proposed technique also takes processing speed
into account: the enhanced search algorithm raises
the processing speed and thereby reduces the
execution time. Table 4 gives the execution time of
the proposed searching scheme for varied bit lengths
of S (N). To measure query efficiency, this
experiment records the query response time, that is,
the time between issuing the query request and
receiving the search results. To test response
times, we set up a multilingual dataset of 2048
encrypted documents with secure, correlated indices
in Chinese, English, and other languages. The
original query has four keywords, and the threshold
value T is 0.5. Figure 3 shows the overhead of our
basic and modified CLCMSE approaches against the
number of returned results, in comparison with the
MRSE schemes. Unlike the MRSE system, our scheme
requires an additional query extension phase. As
Table 4 shows, the execution time of this technique
is very low at the CP, at the CSP, and for the
overall process.
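The query response time defined above, the interval
between issuing a query and receiving its results,
can be measured as in the following sketch. The
in-memory index, the toy_search function, and the
sample keywords are hypothetical stand-ins; they do
not reproduce the encrypted index or the dataset
used in the experiment.

```python
import time

# Hypothetical plaintext index: document id -> set of keywords.
INDEX = {
    "doc1": {"cloud", "storage", "search"},
    "doc2": {"cloud", "encryption"},
    "doc3": {"fuzzy"},
}


def toy_search(keywords):
    """Rank documents by how many query keywords they contain."""
    query = set(keywords)
    scores = {doc: len(words & query) for doc, words in INDEX.items()}
    matching = [doc for doc, score in scores.items() if score > 0]
    return sorted(matching, key=lambda doc: (-scores[doc], doc))


def timed_query(search_fn, keywords):
    """Return the search results and the elapsed response time in seconds."""
    start = time.perf_counter()
    results = search_fn(keywords)
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = timed_query(toy_search, ["cloud", "storage", "search", "encryption"])
print(results, f"{elapsed:.6f} s")
```

In a real measurement the timed call would also
cover the query extension phase and the network
round trip, and it would be repeated several times
to average out noise.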
Execution Time of S(N)


Fig. 2. Graphical representation of the execution
time of the proposed searching scheme (S) with
different bit lengths.


The proposed approach achieves reasonable
satisfaction and works efficiently.


Table 5. Execution time of the proposed searching
scheme (S) with different bit lengths


20.2 


Furthermore, our improved CLCMSE approach builds an
initial heap and adjusts the heap shape dynamically.
With this technique, Merkle searching does not need
to run the whole K-round sorting algorithm. The
larger N is, the more efficient the query is
overall. Table 5 indicates that all protocols
operating on the CP and CSP can be run concurrently,
which greatly improves execution performance.
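The idea of keeping a heap instead of sorting every
candidate can be sketched as below. This is a
generic bounded min-heap selection of the k best
results on plaintext scores, offered only as an
illustration; the function name top_k and the score
dictionary are assumptions, and the paper's protocol
operates on encrypted scores.

```python
import heapq


def top_k(scores, k):
    """Return the ids of the k highest-scoring documents, best first."""
    if k <= 0:
        return []
    heap = []  # min-heap holding the best k (score, doc_id) pairs seen so far
    for doc_id, score in scores.items():
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif (score, doc_id) > heap[0]:
            # The new candidate beats the current worst of the best k.
            heapq.heapreplace(heap, (score, doc_id))
    return [doc_id for score, doc_id in sorted(heap, reverse=True)]
```

With n candidates the heap never holds more than k
entries, so the selection costs O(n log k)
comparisons rather than the O(n log n) of a full
sort.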


Figure 3. Execution time of the proposed searching
scheme (S) with different bit lengths (x-axis:
Number of Returned Results; legend: CSP, Total).


The larger the value of N, the higher the overall
query efficiency. Fig. 3 indicates that the
efficiency of all protocols that use the CP and CSP
can be improved by running them simultaneously.
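Running independent protocol steps at the same time
can be sketched with a thread pool. The helper
run_concurrently and the two placeholder steps
cp_step and csp_step are hypothetical; the sketch
assumes the steps do not depend on each other's
output, which the paper does not spell out.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(tasks):
    """Run independent zero-argument callables and return results in order."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def cp_step():
    """Placeholder for a protocol step executed at the CP."""
    return "cp done"


def csp_step():
    """Placeholder for a protocol step executed at the CSP."""
    return "csp done"


print(run_concurrently([cp_step, csp_step]))
```

In CPython, threads mainly help when the steps wait
on the network or on I/O; CPU-bound steps would need
separate processes or machines to run truly in
parallel.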

5.1 Discussion 

This research effectively resolves the problem of
cross-lingual multi-keyword rank search with a
semantic extension over encrypted data. Through
multilingual keywords and language preference
settings, our CLCMSE scheme also achieves
intelligent and personalized search. We evaluate
the performance of our scheme in terms of index
tuple storage overhead, the number of keywords per
document, and execution time with varied bit lengths
of S(N) at the CP and CSP. The Merkle search
technique increases the search speed compared with
other methods such as cross-lingual multi-keyword
rank search, Multi-keyword Rank Searchable
Encryption, and Verifiable Attribute multi-keyword
search over encrypted cloud data. The proposed
method achieved accurate results in all tested
languages, in particular Indonesian (99%) and
Japanese (98%). In practice, expressive search
queries should be supported, because single-keyword
searches may return many irrelevant results and
degrade the user's search experience.


6. CONCLUSION 


Traditional and broad data retrieval relies on
keyword searches, which have limitations such as a
high manpower need and a dependence on private
data, leading to poor simulation outcomes. To
overcome these issues, we have proposed the CLCMSE
technique, which allows data users to query in any
language and choose the language of the returned
information. This work solves the problem of
cross-lingual multi-keyword rank search over
encrypted data with a semantic extension by
providing a cross-lingual multi-keyword rank search
system (CLCMSE) based on OMW. The presented system
makes a significant advance in searchable encryption
by removing its linguistic restrictions. It also
offers customisable keyword and language request
parameters and automated preference score
computation. Extensive experiments assess the
performance of our scheme, which is then compared
with existing schemes. The enhanced centroid
algorithm clusters the cloud data efficiently, the
Merkle search technique increases the search speed,
and the enhanced fuzzy retrieval system securely
retrieves the data. We also compare our method with
other state-of-the-art methods such as cross-lingual
multi-keyword rank search, Multi-keyword Rank
Searchable Encryption, and Verifiable Attribute
multi-keyword search over encrypted cloud data. The
proposed approach demonstrated accuracy in all
tested languages; in particular, it reached 99% in
Indonesian and 98% in Japanese. In future work, we
will continue to improve the security and the
response speed of this scheme.